{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Building RAG pipelines with Optimized Embedding Models\n",
    "\n",
    "In the following notebook we will show how to utilize two fastRAG components that use an optimized and quantized bi-encoder.\n",
    "\n",
    "We will showcase `QuantizedBiEncoderRetriever` for embedding documents in a vectors store, and `QuantizedBiEncoderRanker` for re-ranking documents in a retrieval pipeline.\n",
    "\n",
    "**NOTE**: Please read carefuly the [guide](../scripts/optimizations/embedders/README.md) we provided on how to maximize the speed/latency on Intel Xeon backends."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, lets build an index. We define the embedding dimension to be as the embedding model, and `return_embedding=True` so we could look at the embeddings."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from haystack.document_stores import InMemoryDocumentStore\n",
    "\n",
    "document_store = InMemoryDocumentStore(use_gpu=False, use_bm25=False, embedding_dim=384, return_embedding=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from haystack.schema import Document\n",
    "\n",
    "# 3 example documents to index\n",
    "examples = [\n",
    "    \"There is a blue house on Oxford street\",\n",
    "    \"Paris is the capital of France\",\n",
    "    \"fastRAG had its first commit in 2022\"\n",
    "]\n",
    "\n",
    "documents = []\n",
    "for i, d in enumerate(examples):\n",
    "    documents.append(Document(content=d, id=i))\n",
    "\n",
    "document_store.write_documents(documents)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Initialize and load an optimized embedding model into a Bi-encoder retriever."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from fastrag.retrievers import QuantizedBiEncoderRetriever\n",
    "\n",
    "retriever = QuantizedBiEncoderRetriever(document_store=document_store, embedding_model=\"Intel/bge-small-en-v1.5-rag-int8-static\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Update the embedding vectors of all documents in the index with encoder. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "document_store.update_embeddings(retriever=retriever)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can look at the embedding vectors stores in the index. For example, lets look at the first document's embedding vector."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "docs = document_store.get_all_documents()\n",
    "docs[0].embedding.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Adding an optimized ranker\n",
    "\n",
    "We can add an optimized ranker to re-order the documents coming from the retriever. \n",
    "Note that this is component has no dependencies on the previous retrieval steps. It takes the document content and query, and encodes all to vectors to be re-ordered by ordering the similarities in a descending order."
   ]
  },
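  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the re-ordering step concrete, here is a minimal, self-contained sketch of similarity-based ranking. It is not fastRAG's implementation: the toy 3-dimensional vectors stand in for real model embeddings, and `cosine_similarity` and `rank_by_similarity` are hypothetical helper names introduced only for illustration. Using cosine similarity is also an assumption; the actual ranker may score documents with a different similarity function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "# Hypothetical helpers for illustration only; not part of fastRAG or Haystack.\n",
    "def cosine_similarity(a, b):\n",
    "    # Cosine of the angle between two equal-length, non-zero vectors.\n",
    "    dot = sum(x * y for x, y in zip(a, b))\n",
    "    norm_a = math.sqrt(sum(x * x for x in a))\n",
    "    norm_b = math.sqrt(sum(y * y for y in b))\n",
    "    return dot / (norm_a * norm_b)\n",
    "\n",
    "def rank_by_similarity(query_vec, doc_vecs):\n",
    "    # Indices of doc_vecs, sorted by descending similarity to query_vec.\n",
    "    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]\n",
    "    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)\n",
    "\n",
    "# Toy vectors standing in for model embeddings (assumed values).\n",
    "toy_query = [1.0, 0.0, 0.0]\n",
    "toy_docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]\n",
    "\n",
    "# The document closest in direction to the query comes first.\n",
    "ranking = rank_by_similarity(toy_query, toy_docs)\n",
    "ranking"
   ]
  },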
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from fastrag.rankers import QuantizedBiEncoderRanker\n",
    "\n",
    "ranker = QuantizedBiEncoderRanker(\"Intel/bge-small-en-v1.5-rag-int8-static\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Combining all into a pipeline."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from haystack import Pipeline\n",
    "\n",
    "p = Pipeline()\n",
    "p.add_node(component=retriever, name=\"retriever\", inputs=[\"Query\"])\n",
    "p.add_node(component=ranker, name=\"ranker\", inputs=[\"retriever\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "p.run(query=\"What is Paris?\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "opt",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
